Compression of 3D Video: Various Approaches to Compress Time Varying Meshes

Author

  • Kiyoharu Aizawa
Abstract

Three-dimensional (3D) video is attracting a lot of attention as a new multimedia representation method. 3D video is a sequence of 3D models (frames) whose vertices and connectivity vary from frame to frame. The amount of data in 3D video is huge, so compression is indispensable. We have developed three different compression methods. Two of them are waveform coding methods: one is Extended Block Matching and the other is two-step quantization with run-length coding. These waveform coding methods achieve compression ratios from 1/10 to 1/30 without significant degradation. The third is model-based compression for very low bit rate applications such as mobile phones. Assuming the object is a person, its skeleton and segmentation are obtained, and the motion of each part of the skeleton is analyzed. The 3D video is reproduced from the first frame and the motions. In this paper, the three schemes are described.

INTRODUCTION

3D imaging has been attracting attention for a long time. One of the latest developments is 3D video generated from a number of views [1-3]. 3D video consists of a series of static 3D models that are generated frame by frame. Therefore, the number of vertices and the connectivity of the geometry data vary with time. We call such geometry data a Temporal Varying Mesh, or TVM for short. TVM will have many applications in 3D content archiving, communication, and entertainment. However, the data size of TVM is quite large. For instance, each frame of the TVM in [2] consumes 5 to 10 MB, depending on its spatial resolution. Therefore, efficient compression is required.

Typical methods for video compression have two compression modes: intra-frame and inter-frame coding [5]. Intra-frame coding reduces spatial redundancy within each frame. From this point of view, the 3D mesh compression techniques reported so far [6-8] are intra-frame coding. Inter-frame coding, on the other hand, exploits temporal redundancy.
From this point of view, most previous inter-frame compression methods have focused on time-consistent 3D meshes used for computer animation [8-12]. Ibarria et al. used a space-time predictor that exploits spatial and temporal redundancy between the current and reference frames [9]. Gupta et al. used the iterative closest point algorithm to group vertices whose movement can be represented by an affine transform matrix within a given threshold [10]. Their basic computation unit was a vertex, and they assumed that topological information does not change with time. Guthe and Straßer applied wavelet transforms and a motion compensation framework to animated volume data [12]. Note that their data are very different from ours. As far as we know, our work is the first attempt at inter-frame compression of TVM. We think this is because TVM generation using a number of synchronized cameras has started only recently and is still in its infancy.

Applying 3D animation compression to TVM is not appropriate because TVM differs from computer animation in several respects. The most significant difference is that no explicit correspondence exists between consecutive frames. The number of vertices and the topological (connectivity) information differ from frame to frame because TVM is generated frame by frame. For this reason, it is difficult to establish the vertex correspondence between frames that a 3D animation compression scheme requires. Besides, each frame is a highly detailed model with more than 50,000 vertices. In this paper, we describe our three methods for compressing TVM data.

EXTENDED BLOCK MATCHING [13]

We propose a compression method for TVM that encodes the geometric information in consecutive frames. Our method uses the block matching algorithm that is commonly used for inter-frame coding in 2D video compression [5]. We extend the block matching algorithm to 3D space, and therefore call this method the extended block matching algorithm (EBMA).
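The idea of block matching in 3D can be illustrated with a minimal sketch. It assumes that each frame is given as a list of (centroid, normal) pairs, one per face, and that blocks are compared by their mean surface normal, as step (2) of the scheme describes. The function names, the cosine-based cost, and the exhaustive search window are hypothetical choices made for illustration; they are not taken from [13].

```python
from itertools import product
from math import inf, sqrt


def mean_normal(normals):
    """Average a list of normals and renormalize; None for an empty block."""
    if not normals:
        return None
    sx, sy, sz = (sum(n[i] for n in normals) for i in range(3))
    length = sqrt(sx * sx + sy * sy + sz * sz)
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (sx / length, sy / length, sz / length)


def normals_in_block(faces, origin, size):
    """Normals of faces whose centroid lies in the cube [origin, origin + size)."""
    return [
        normal
        for centroid, normal in faces
        if all(origin[i] <= centroid[i] < origin[i] + size for i in range(3))
    ]


def best_motion_vector(cur_faces, ref_faces, origin, size, search=1, step=1.0):
    """Find the reference block whose mean normal best matches the current block.

    Returns (motion_vector, cost), or None if the current block is empty or
    no candidate block in the search window contains any surface.
    """
    target = mean_normal(normals_in_block(cur_faces, origin, size))
    if target is None:
        return None
    best, best_cost = None, inf
    for d in product(range(-search, search + 1), repeat=3):
        shift = tuple(k * step for k in d)
        cand_origin = tuple(origin[i] + shift[i] for i in range(3))
        cand = mean_normal(normals_in_block(ref_faces, cand_origin, size))
        if cand is None:
            continue
        # Assumed cost: 1 - cosine similarity of the two mean normals.
        cost = 1.0 - sum(a * b for a, b in zip(target, cand))
        if cost < best_cost:
            best, best_cost = shift, cost
    return (best, best_cost) if best is not None else None


# Toy data: an upward-facing patch moves one block along +x between frames.
ref = [((0.5, 0.5, 0.5), (0.0, 0.0, 1.0)), ((1.5, 0.5, 0.5), (1.0, 0.0, 0.0))]
cur = [((1.5, 0.5, 0.5), (0.0, 0.0, 1.0))]
print(best_motion_vector(cur, ref, origin=(1.0, 0.0, 0.0), size=1.0))
```

Here the motion vector is the offset from the current block to its best match in the reference frame. A real encoder would also have to handle ties and blocks that have no good match.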
Experiments using several TVM sequences have demonstrated very encouraging results. The coding scheme is composed of the following steps.
(1) The local surfaces of the current frame are defined by cubic blocks.
(2) Each local surface is compared to the surface in the previous frame. The comparison between the irregular surfaces is done by using the average of the surface normals of the meshes.
(3) The best matching surface is determined. The position of the best match is sent as a motion vector.
(4) The differences between the vertices of the local surface and those of the best matched surface are calculated and transformed by DCT.
(5) The DCT coefficients are encoded.
Because the local surface contour is irregular, the vertices are ordered. The differences are taken between the nearest vertices of the current local surface and the previous local surface. The index must be encoded too. As a result, the geometry was compressed to 17.8 bpv (18% of the original data size) without loss of information, and to 9.1 bpv (10%) with an RMS error of 0.86 cm.

TWO STEP QUANTIZATION WITH RUNLENGTH CODING [14]

Extended block matching makes use of temporal redundancy, but it requires additional index information so that each difference value can be matched to a vertex of the reference surface. This overhead limits the compression performance to some extent. We therefore investigated a different approach that uses simpler techniques. The surface vertices are quantized in two steps, a coarse step and a fine step, and the quantized results are encoded by run-length coding. The method is composed of the steps below.
(1) The surface of the TVM is coarsely quantized; that is, the bounding box of the frame is divided into coarse blocks. The blocks that contain the surface of the frame (the effective blocks) are determined, and the flags of the effective blocks are encoded by run-length coding.
(2) Each effective coarse block is further quantized by a finer quantizer; that is, the vertices contained in the coarse block are quantized.
(3) The quantized vertex values are encoded by run-length coding.
In addition, we make use of temporal redundancy: the determination of effective coarse blocks is based on the change from the previous values. Thus, the effective flag is activated only when a coarse block is effective in the current frame but was not effective in the previous frame. That is, only the changing flags are encoded. Compared to the Extended Block Matching scheme, this method is simpler and much more efficient. Experimental results show that the vertices of TVMs, which originally require 96 bits per vertex (bpv), are compressed to 1.9-15.4 bpv while maintaining a small geometric distortion ranging from 0:7x10 to 1:3x10 % of the maximum error.

Fig. 1. Sub-blocking: (a) previous (reference) frame; (b) current frame (to be encoded) and its bounding box.
Fig. 2. Motion compensation of 3D video: (a) batter frame #1; (b) batter frame #2; (c) motion vectors.
Fig. 3. (a) Original (96 bpv); (b) 15.4 bpv (0.02 cm); (c) 11.1 bpv (0.04 cm); (d) 8.0 bpv (0.08 cm); (e) 5.3 bpv (0.16 cm); (f) 1.9 bpv (0.33 cm).

MODEL-BASED CODING WITH MOTION TRACKING [15,16]

To use 3D video at very low bit rates, for example on mobile phones (the latest mobile phones have 3D graphics capability), further compression on the order of 1/10000 is required. We investigated a different approach to compression: the motion of the 3D moving object is tracked, and only the motion data and the first frame are encoded. On the receiving end, each frame is reconstructed from the first frame driven by the motion. We assume that the target TVM is a moving person with an initial skeleton. Afterward, the TVM is motion-tracked and segmented, and the skeleton is adjusted every frame. Segmentation and skeleton alignment of each frame are cyclically optimized.
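The receiver-side reconstruction can be sketched as follows. This is only an illustration under strong assumptions that the text does not state: each body part is taken to move rigidly, and its motion is reduced to a rotation about the z axis through a pivot (for example a joint) plus a translation. The function names and the layout of the motion data are hypothetical and not taken from [15,16].

```python
from math import cos, sin, pi


def rotate_about_z(point, pivot, angle):
    """Rotate a 3D point by `angle` radians about the z axis through `pivot`."""
    x, y, z = (point[i] - pivot[i] for i in range(3))
    c, s = cos(angle), sin(angle)
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y, pivot[2] + z)


def reconstruct_frame(first_frame, part_of_vertex, motions):
    """Animate the first-frame vertices with per-part rigid motions.

    first_frame:     list of (x, y, z) vertices of the first frame
    part_of_vertex:  body-part label of each vertex (from segmentation)
    motions:         part label -> (pivot, angle, translation); parts that
                     are missing from the dict are left unchanged
    """
    frame = []
    for vertex, part in zip(first_frame, part_of_vertex):
        if part in motions:
            pivot, angle, translation = motions[part]
            moved = rotate_about_z(vertex, pivot, angle)
            vertex = tuple(moved[i] + translation[i] for i in range(3))
        frame.append(vertex)
    return frame


# Toy data: the "arm" vertex swings 90 degrees and lifts; the "torso" stays.
first = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
labels = ["arm", "torso"]
motions = {"arm": ((0.0, 0.0, 0.0), pi / 2, (0.0, 0.0, 1.0))}
print(reconstruct_frame(first, labels, motions))
```

Only the small per-part motion parameters change from frame to frame in such a scheme, which is why the transmitted data per frame can be so much smaller than a full mesh.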
The initial skeleton of the next frame is predicted from the current skeleton and the previous motion. Again, segmentation of the body parts and skeleton alignment are cyclically optimized. For mobile phone applications, mesh reduction is applied to the first frame of the TVM, and the reduced first frame is encoded. The initial frame is then animated by the motion data for a while. Fig. 4 shows the motion of the skeleton of two TVMs. With this model-based approach, the information required for 3D video is highly compressed. For example, the motion data is about 300 bytes per frame.

CONCLUSION

In this paper, we described our work on 3D video compression. Two of the methods are so-called waveform coding and one is model-based coding. The latter is suited to very low bit rate applications such as mobile phones.


Similar resources

Fast Intra Mode Decision for Depth Map coding in 3D-HEVC Standard

Three-dimensional high efficiency video coding (3D-HEVC) is the extended version of the latest video compression standard, high efficiency video coding (HEVC), and is used to compress 3D videos. 3D videos include texture video and depth maps. Since the statistical characteristics of depth maps are different from those of texture videos, new tools have been added to the HEVC standard fo...


An efficient subdivision inversion for wavemesh-based progressive compression of 3D triangle meshes

Wavemesh is a powerful scheme for 3D triangular mesh processing. In sharp contrast with other approaches using wavelets for mesh compression which apply only to meshes having subdivision connectivity, Wavemesh can simplify, approximate, and compress meshes even if they do not respect this constraint, with unmatched results for progressive lossless compression when compared to other approaches. ...


Temporal DCT-Based Compression of 3D Dynamic Meshes

This paper introduces a new compression scheme for 3D dynamic meshes with constant connectivity and time-varying geometry. The proposed approach, referred to as Temporal-DCT encoder (TDCT), combines a piecewise affine prediction scheme with a temporal DCT-based compression of the prediction errors. Experiments show that TDCT achieves up to 70% and 54% lower compression distortions than the GV a...


Geometry Compression of Normal Meshes Using Rate-Distortion Algorithms

We propose a new rate-distortion based algorithm for compressing 3D surface geometry represented using triangular normal meshes. We apply the Estimation-Quantization (EQ) algorithm to compress normal mesh wavelet coefficients. The EQ algorithm models the wavelet coefficients as a Gaussian random field with slowly varying standard deviation that depends on the local neighborhood and uses rate-di...


Technologies for 3D mesh compression: A survey

Three-dimensional (3D) meshes have been widely used in graphic applications for the representation of 3D objects. They often require a huge amount of data for storage and/or transmission in the raw data format. Since most applications demand compact storage, fast transmission, and efficient processing of 3D meshes, many algorithms have been proposed to compress 3D meshes efficiently since early...


Real-Time 3D Video Compression for Tele-Immersive Environments

Tele-immersive systems can improve productivity and aid communication by allowing distributed parties to exchange information via a shared immersive experience. The TEEVE research project at the University of Illinois at Urbana-Champaign and the University of California at Berkeley seeks to foster the development and use of tele-immersive environments by a holistic integration of existing compo...




Publication date: 2009